IN THE ABSTRACT: 

Please replace the originally-filed Abstract with the accompanying Abstract, which 
complies with size constraints by comprising 140 words. 

IN THE CLAIMS: 

1. (Cancel) A software program for providing instructions to a processor which controls a 
system for applying actor-critic based fuzzy reinforcement learning, comprising: 

a database of fuzzy-logic rules for mapping input data to output commands for mapping 
a system state; and 

a reinforcement learning algorithm for updating the fuzzy-logic rules database based on 
effects on the system state of the output commands mapped from the input data, and 

wherein the reinforcement learning algorithm is configured to converge at least one 
parameter of the system state towards at least approximately an optimum value following 
multiple mapping and updating iterations. 

2. (Cancel) The software program of claim 1, wherein the reinforcement learning algorithm 
is based on an update equation including a derivative with respect to said at least one parameter 
of a logarithm of a probability function for taking a selected action when a selected state is 
encountered. 

3. (Cancel) The software program of claim 2, wherein the reinforcement learning algorithm 
is configured to update the at least one parameter based on said update equation. 

4. (Cancel) The software program of any of claims 1-3, wherein the system includes a 
wireless transmitter. 

5. (Currently amended) A method of controlling a system including a processor for 
applying actor-critic based fuzzy reinforcement learning to perform power control in a wireless 
transmitter, comprising the [operations] acts of: 

mapping input data to output commands for modifying a system state according to 
fuzzy-logic rules; 

using continuous, reinforcement learning, updating the fuzzy-logic rules based on effects 
on the system state of the output commands mapped from the input data; and 

converging at least one parameter of the system state towards at 
least approximately an optimum value following multiple mapping and 
updating iterations. 

6. (Currently amended) The method of claim 5, wherein [the] updating [operation] includes 
taking a derivative with respect to said at least one parameter of a logarithm of a probability 
function for taking a selected action when a selected state is encountered. 

7. (Currently amended) The method of claim 6, wherein [the] updating [operation] includes 
updating the at least one parameter based on said derivative. 

8. (Cancel) The method of any of claims 5-7, wherein the system includes a wireless 
transmitter. 

9. (Cancel) A system controlled by an actor-critic based fuzzy reinforcement learning 
algorithm which provides instructions to a processor of the system for applying actor-critic 
based fuzzy reinforcement learning, comprising: 

the processor; 

at least one system component whose actions are controlled by said processor; 

at least one storage medium accessible by said processor, including data stored therein 
corresponding to: 

a database of fuzzy-logic rules for mapping input data to output commands for 
modifying a system state; and 

a reinforcement learning algorithm for updating the fuzzy-logic rules database 
based on effects on the system state of the output commands mapped from the input 
data, and 

wherein the reinforcement learning algorithm is configured to converge at least one 
parameter of the system state towards at least approximately an optimum value following 
multiple mapping and updating iterations. 

10. (Cancel) The system of claim 9, wherein the reinforcement learning algorithm is based on 
an update equation including a derivative with respect to said at least one parameter of a 
logarithm of a probability function for taking a selected action when a selected state is 
encountered. 

11. (Cancel) The system of claim 10, wherein the reinforcement learning algorithm is 
configured to update the at least one parameter based on said update equation. 

12. (Cancel) The system of any of claims 9-11, wherein said at least one system component 
comprises a wireless transmitter. 

13. (New) A computer-readable medium containing instructions which, when executed by a 
computer, control a system for applying actor-critic based fuzzy reinforcement learning, by: 

maintaining a database of fuzzy-logic rules for mapping input data to output commands 
for modifying a system state by using continuous, reinforcement learning to update the 
fuzzy-logic rules database based on effects on the system state of the output commands to control a 
wireless transmitter, the output commands mapped from the input data; and 

converging at least one parameter of the system state towards at least approximately an 
optimum value following multiple mapping and updating iterations. 

14. (New) The computer-readable medium of claim 13, wherein updating the fuzzy-logic 
database comprises utilizing a derivative with respect to said at least one parameter of a logarithm 
of a probability function for taking a selected action when a selected state is encountered. 

15. (New) The computer-readable medium of claim 14, wherein the at least one parameter is 
updated utilizing the derivative with respect to said at least one parameter of a logarithm of a 
probability function for taking a selected action when a selected state is encountered. 

16. (New) The computer-readable medium of any of claims 13-15, wherein the system state 
comprises system state of a wireless transmitter. 

17. (New) A system controlled by actor-critic based fuzzy reinforcement learning, 
comprising: 

a processor; 

at least one system component whose actions are controlled by the processor; and 

instructions, which, when executed by the processor: 

maintain a database of fuzzy-logic rules for mapping input data to output commands for 
modifying a system state by using continuous, reinforcement learning to update the fuzzy-logic 
rules database based on effects on the system state of the output commands mapped from the 
input data; and 

converge at least one parameter of the system state towards at least approximately an 
optimum value following multiple mapping and updating iterations. 

18. (New) The system of claim 17, wherein updating the fuzzy-logic database comprises 
utilizing a derivative with respect to said at least one parameter of a logarithm of a probability 
function for taking a selected action when a selected state is encountered. 

19. (New) The system of claim 18, wherein the at least one parameter is updated utilizing the 
derivative with respect to said at least one parameter of a logarithm of a probability function for 
taking a selected action when a selected state is encountered. 

20. (New) The system of any of claims 17-19, wherein the system state comprises system 
state of a wireless transmitter. 